FOREWORD


The Leadership and Management Technical Area of the U.S.
Army Research Institute (ARI) conducts programmatic research to
improve leader effectiveness, with a focus on the sequential,
progressive development of leaders. To support this and other
research, ARI is developing an Officer Longitudinal Research Data
Base (OLRDB) along with an online user's manual and data dictionary
stored in the ARI VAX computer. The data base will enable
researchers to produce data-based information on officer training,
professional development, and utilization.

The online OLRDB manual was developed to facilitate access
to the data base and to provide an efficient means of incorporating
changes resulting from regular updating of the data base.

This "hard copy" of the user's manual has been prepared to serve
as a general introduction to the data base. This version contains
the same material as the online manual, less the data dictionary.
It is intended to help researchers, especially those
without ready access to the ARI VAX, to assess the potential
usefulness of the data base for their research.

The development of the OLRDB has been briefed to the research
sponsor, the Center for Army Leadership (29 April 1987),
which recognizes its role as a research tool to generate information
necessary for systematic enhancement of leader training
and effectiveness.


EDGAR  M.  JOHNSON 
Technical  Director 
IMPORTANT  NOTES


about  the  "Hard  Copy"  of  the 
U.S.  Army  Research  Institute 
Officer  Longitudinal  Research  Data  Base  (OLRDB) 

User's  Manual 


The  U.S.  Army  Research  Institute  (ARI)  Officer  Longitudinal 
Research  Data  Base  (OLRDB)  will  be  updated  on  a  regular  basis  to 
enable  research  on  the  current  Army  officer  population,  as  well 
as  analyses  requiring  historical  tracking  of  information.  The 
online  User's  Manual  was  developed  to  facilitate  access  to  the 
OLRDB  data.  It  also  provides  efficient  means  to  update  the  Data 
Dictionary  with  new  information. 

This  "hard  copy"  of  the  User's  Manual  has  been  prepared  to  serve 
as  a  general  introduction  to  the  data  base,  primarily  for 
researchers  without  ready  access  to  the  ARI  VAX.  It  is  intended 
to  help  them  assess  the  potential  usefulness  of  the  data  base  for 
their  work.  This  version  contains  the  same  material  as  the 
online  manual,  less  the  Data  Dictionary.  Researchers  who  wish  to 
examine  the  Data  Dictionary  need  to  do  so  by  accessing  the  online 
manual  in  the  ARI  VAX  computer. 

The  sections  in  the  "hard  copy"  manual  are  not  organized  in  the 
standard  report  format.  They  do,  however,  represent  an  exact
copy  of  the  online  manual,  intended  to  assure  comparability  for 
users  referring  to  the  online  and  the  "hard  copy"  versions. 

The  readers  of  the  "hard  copy"  are  reminded  to  note  the  date  of 
this  preface  and  inquire  about  an  updated  version,  if  the  date 
appears  old. 


Fumiyo  T.  Hunter 
OLRDB  Project  Manager 

AV  284-8293 
(202)  274-8293 

September  1987 
OFFICER  LONGITUDINAL  RESEARCH  DATA  BASE  (OLRDB)

USER'S  MANUAL 


INTRODUCTION 


The  Online  User's  Manual  for  the  Officer  Longitudinal  Research 
Data  Base  (OLRDB)  was  designed  to  facilitate  ARI  researchers' 
access to data in the OLRDB. The manual consists of two parts.
The first is the text file you are now reading, which
describes:

(1) the data base (purpose, contents, organization, how to
request data, references),

(2)  how  to  use  the  Data  Dictionary,  and 

(3)  how  to  run  a  job  using  data  from  the  data  base. 

The  second  part  is  the  Data  Dictionary.  The  dictionary  provides 
information  about  each  data  element,  or  variable,  in  the  OLRDB, 
including  which  data  set  contains  the  variable,  description  of 
the  variable,  how  it  is  coded,  and  the  descriptions  of  the  codes. 

To  use  the  Data  Dictionary  and  the  OLRDB,  a  researcher  should 
first  become  familiar  with  the  information  in  the  User's  Manual. 
Then,  by  looking  through  the  Data  Dictionary  (see  "How  to  Use  the 
Data  Dictionary"  section  of  this  Manual),  one  can  determine  the 
availability  of  desired  information  in  the  OLRDB.  Once  the 
researcher  selects  the  OLRDB  data  elements  to  be  analyzed,  the 
Data  Request  Form,  shown  later  in  this  Manual,  will  be  submitted 
to  the  Data  Base  Manager.  The  examples  in  the  "How  to  Run  a  Job" 
section  will  assist  the  researcher  in  compiling  the  computer 
commands  needed  to  construct  a  working  file  and  analyze  data 
using  the  Statistical  Analysis  System  (SAS)  at  the  National 
Institutes  of  Health  (NIH)  computer  facility.  Researchers  using 
the  NIH  computer  would  also  be  able  to  apply  other  statistical 
packages  such  as  SPSS  and  BMDP  to  SAS-format  working  files. 

Knowledge  of  statistical  packages,  particularly  SAS,  is  helpful. 
Understanding  how  to  use  the  NIH  computer  is  also  helpful.  Full 
instruction  in  these  areas  is  beyond  the  scope  of  this  User's 
Manual,  but  excellent  documentation  is  available  on  both  SAS  and 
the  NIH-IBM  system.  Courses  are  also  available  on  these  topics. 
If  you  need  additional  information  or  have  any  comments  about  the 
accuracy,  readability,  or  helpfulness  of  the  User's  Manual, 
contact the OLRDB Manager, Fumiyo Hunter (202-274-8293, AV 284-8293,
ARI VAX Username = OLRDB).


DESCRIPTION  OF  THE  OLRDB 


Purpose 

The  purpose  of  the  Officer  Longitudinal  Research  Data  Base 
(OLRDB)  is  to  make  personnel,  pre-  and  post-commissioning 
training,  and  field  performance  data  on  U.S.  Army  officers 
available  for  research.  The  OLRDB  was  constructed  to  provide 
efficient  access  to  these  data  with  the  best  possible 
documentation.  Historical  data  from  various  Army  agencies  have 
been  incorporated  into  several  SAS  data  sets,  organized  by 
similarity  of  contents,  with  variable  and  code  labels.  This 
Online  User's  Manual  and  a  comprehensive  Data  Dictionary  are 
provided  to  assist  researchers  in  using  the  data  base  most 
effectively. 

Contents 


1.  Officer  Personnel  Data.  The  OLRDB  began  by  converting  yearly 
"snapshots"  (1979-1985)  of  the  Officer  Master  File  (OMF)  compiled 
by  U.S.  Army  Military  Personnel  Center  into  SAS  data  sets  and 
describing  these  data  in  the  Data  Dictionary. 

Personnel data from the Defense Manpower Data Center (DMDC)
Master and Loss File, containing separation information for the
period of 1970-1985, were used to cross check the accuracy of
data from the 1979-1985 OMF and to obtain OMF-equivalent data for
1970-1978.  The  OLRDB  Core  Data  Set  was  then  constructed  to  (1) 
provide  an  accurate,  comprehensive  list  of  commissioned  officers 
on  active  duty  during  the  period  of  1970-1985,  (2)  provide  one 
data  set  with  the  most  commonly  used  data,  and  (3)  create  an 
encrypted  list  of  identifying  information  to  make  it  possible  to 
link  individual  data  from  various  data  sets  while  maintaining  a 
level  of  privacy. 

All  OLRDB  data  sets  are  updated  yearly.  Thus,  all  commissioned 
officers  on  active  duty  are  covered  for  the  period  1970  to 
present.  The  majority  of  cases  are  in  the  Regular  Army  or  Active 
Duty  Reserve  forces. 

2.  Pre-Commissioning Data.  Data from pre-commissioning sources
such as Reserve Officers' Training Corps (ROTC), Officer
Candidate School (OCS), and the U.S. Military Academy (USMA) are
being  added  as  they  are  obtained.  ROTC  Advanced  Camp  performance 
and  Commissioning  File  data  collected  during  1982-1985  have  been 
included  into  the  OLRDB.  The  USMA  and  OCS  data  sets  will  be 
established  in  1987-1988. 


3.  Post-Commissioning Data.  Information from the Automated
Instructional Management System (AIMS), a TRADOC data management
system containing Officer Basic and Advanced Course grades, will
be incorporated during 1987-1988.  Leadership effectiveness and
unit  performance  data  from  field  training  exercises  and  garrison 
settings  will  be  added  as  they  become  available. 

Organization  of  Data 

The  OLRDB  data  reside  on  a  set  of  magnetic  tapes  at  the  National 
Institutes  of  Health  computer  facility  with  an  IBM  mainframe, 
while  the  User's  Manual  and  Data  Dictionary  are  on-line  on  the 
ARI  VAX  computer. 

The  OLRDB  can  be  thought  of  as  one  big  data  set  divided  into 
pieces.  A  simple  view  might  look  like  this: 


                           Variables

Individuals   Sex   Rank   Branch   Marital   Date of
                                    Status    Commission

     1        ...   CPT    AR       ...       14 JUN 74
     2        F     ...    AR       ...       03 AUG 71
     3        M     MAJ    IN       ...       27 JAN 71
     4        M     LTC    IN       S         18 DEC 68
     5        F     MAJ    MI       ...       09 NOV 73
     6        M     2LT    FA       D         23 JUN 85

Each individual has one record containing all available data on
that individual. However, because of the large number of
variables (>1K), the variables have been divided into 18 data
sets, as of September 1987, which can be merged together for each
individual. All but one of these data sets are in SAS format,
with variables and data values already labeled.

1.  Personnel Data Sets.  Data from the OMF and Master and Loss
File are divided into 14 groups (such as biographical data,
assignment history, promotion history, military and civilian
education, branch/functional area/skills, and awards) and stored
on  14  magnetic  tapes.  Each  of  these  tapes  contains  the  most 
recent  information  on  all  active  duty  records  from  1979  to 
present.  For  officers  who  have  separated  during  this  period,  the 
information  at  the  time  of  separation  is  retained.  For  officers 
presently  on  active  duty,  the  most  current  information,  updated 
yearly,  is  retained  in  the  SAS  data  sets. 

The 14 data sets described above collectively contain most of the
variables in the OMF. In contrast, the two core data sets
include selected OMF variables frequently used for research and
contain the most rigorously verified listing of active duty cases
from 1970-present. The first of these is the SAS Core Data Set.
Like the other 14 SAS data sets, the SAS Core Data Set contains
the most recent information available for each record. As
personnel information on individuals changes over the years, the
new information replaces the old in these data sets. (See ARI
Research Product, entitled Development of Core Data Set of the
Officer Longitudinal Research Data Base, 1987, by D. Younkman,
Fu Associates, Ltd.)

The  second  core  data  set  is  the  Longitudinal  Data  Set,  the  only 
OLRDB  data  set  not  in  the  SAS  format.  This  data  set  contains  the 
same  variables  as  the  SAS  Core  Data  Set,  but  in  a  "cleaned-up"
raw  form.  In  addition,  this  data  set  repeats  the  same  set  of
variables  (with  different  values  when  they  change)  for  each  year 
from  1970  to  present.  This  feature  permits  tracking  of  changes 
in  the  values  of  any  given  variable  over  the  years  and  the 
approximate  timing  of  the  changes.  It  also  allows  application 
of  computer  programs  other  than  SAS. 


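LONGITUDINAL DATA SET (Contains yearly core data set variables)

+------+------+------+------+------+------+
| 1970 | 1971 | 1972 | 1973 | 1974 | 1975 |
+------+------+------+------+------+------+
| DATA | DATA | DATA | DATA | DATA | DATA |
+------+------+------+------+------+------+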
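
The year-by-year layout lends itself to a simple change-tracking
pass. The following is a minimal sketch only, not part of the
OLRDB job set-ups. Because the Longitudinal Data Set is not in SAS
format, the sketch assumes it has already been read into a SAS
working file, here given the hypothetical name LONGIT, and that it
holds one character variable per year with the hypothetical names
RANK70 through RANK85. Actual variable names and record layout
must be taken from the Data Dictionary.

```sas
DATA CHANGES;
  SET LONGIT;                     * hypothetical working file;
  * RNK(1) is the 1970 value and RNK(16) is the 1985 value;
  ARRAY RNK{16} $ RANK70-RANK85;  * hypothetical yearly variables;
  DO I = 2 TO 16;
    * Write one record for each year whose value differs from the prior year;
    IF RNK{I} NE RNK{I-1} THEN DO;
      YEAR    = 1969 + I;
      OLDRANK = RNK{I-1};
      NEWRANK = RNK{I};
      OUTPUT;
    END;
  END;
  KEEP MATCHCOD YEAR OLDRANK NEWRANK;
RUN;
```

Note that a change from a blank to a non-blank value (for example,
the first year an officer appears) is also reported by this
comparison, so such records may need to be screened separately.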

2.  Pre-Commissioning  Training  Data.  There  are  2  ROTC  data  sets 
in  SAS  format,  one  containing  the  ROTC  Advanced  Camp  performance 
records  and  the  other,  the  final  program  records  (Commissioning 
File  information)  from  1982  to  present.  (See  ARI  Research 
Product,  entitled  Development  of  ROTC  Data  Sets  and  Evaluation  of 
Their  Usefulness  for  Officer  Longitudinal  Research  Data  Base, 
1987,  by  D.  Younkman,  Fu  Associates,  Ltd.)  Unlike  the  personnel 
data  sets  in  which  the  same  records  are  updated  yearly,  the 
training  data  sets  contain  different  sets  of  individuals  in  each 
yearly  segment.  For  example,  the  Advanced  Camp  data  for  Officer 
X  would  be  found  in  the  Advanced  Camp  data  set  for  the  year  that 
he/she  attended  the  Advanced  Camp. 


The  academic  records  from  the  U.S.  Military  Academy  (USMA)  and 
Officer  Candidate  School  (OCS)  will  be  structured  similarly  to 
the  ROTC  data  sets  and  integrated  into  the  OLRDB  in  1987-1988. 

3.  Post-Commissioning  Training  Data.  As  of  June  1987,  work  was 
underway  to  collect  the  Officer  Basic  and  Advanced  Course  (OBC, 
OAC)  performance  data  from  the  Automated  Instructional  Management 
System,  Training  and  Doctrine  Command,  and  to  integrate  them  into 
the  OLRDB.  When  completed,  these  data  will  be  organized  by 
branch school, course (OBC and OAC), and year.  As with the
pre-commissioning data sets, an individual's performance in OBC or
OAC will be entered once, for the year the course is taken, and
not  revised  yearly.  However,  some  individuals  will  be  included 
in  the  OBC  data  set  and,  later,  in  the  OAC  data  set. 

4.  Data  Dictionary.  There  is  one  online  Data  Dictionary 
describing  data  elements  from  all  OLRDB  data  sets,  with 
specifications  as  to  which  data  set  and  tape  contain  each  data 
element.  One  can  query  the  Data  Dictionary  to  list  names  of  all 
data  sets,  all  variables  in  a  given  data  set,  all  data  values  for 
a  given  data  element,  etc.,  and  print  out  any  part  of  these  lists 
as  hard  copies.  Additional  considerations  might  be:  In  what 
years  was  particular  information  collected?  Are  there  known 
problems  with  the  variables  or  codes?  Are  there  related 
variables  a  researcher  might  want  to  consider?  The  Data 
Dictionary  contains  answers  to  many  of  these  questions.  See 
instructions  for  entering  the  Data  Dictionary  at  the  end  of  this 
User's  Manual. 

If  researchers  know  what  variables  they  are  looking  for,  they  can 
enter  the  Data  Dictionary,  examine  the  descriptions  and  codes  for 
those  variables,  and  determine  the  usability  of  the  data  and  what 
data  set(s)  are  needed.  Queries  on  keywords  or  codes  are
possible  if  the  variables  are  not  known  in  advance.  One  can  also 
scan  the  data  sets,  variables,  and  codes  to  get  an  idea  of  what 
data  are  included.  Instructions  for  querying  the  Data  Dictionary 
and  examining  variables  are  provided  once  you  have  entered  the 
Data  Dictionary. 

Accessing  and  Merging  Data  Sets 

Researchers  with  NIH  accounts  may  access  the  OLRDB  data  sets  at 
the  NIH  computer  facility.  The  OLRDB  data  tapes  can  be  read  but 
are  protected  from  further  manipulation.  The  standard  procedure 
for users is: (1) to extract from the data sets the variables
selected for a particular research need, (2) to create a working
file of the extracted data to be stored under the researcher's
account, and (3) to scratch the working file when the necessary
analyses are completed, unless yearly update and continued use of
the working file is planned. Researchers are responsible for
creating the working file and using it in accordance with the
requirements that are specified in the Data Request Form shown
later.

To  protect  the  privacy  of  individuals,  social  security  numbers 
have  been  encrypted  and  other  personal  identifiers  have  been 
stripped  from  the  OLRDB  data  sets,  and  a  linking  file  of 
encrypted  information  has  been  constructed.  Each  record  in  each 
data  set  contains  an  encrypted  personal  identifier  (MATCHCOD). 
Variables  from  several  OLRDB  data  sets  can  be  merged  together  by 
the  matchcode  to  create  a  working  data  set  with  just  the  data 
needed  for  a  particular  research  project. 
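
A merge by the matchcode can be sketched in SAS as follows. This
is an illustration under stated assumptions rather than a tested
job: the names FYALL1, FYALL2, and PERM are assumed to be DD names
allocated in the JCL, as in the examples in the "How to Run a Job"
section; the data set and variable names (BIO85A, RAN85A, HEIGHT,
TGRA, HGTRANK) follow those examples and should be checked against
the Data Dictionary; and each data set is assumed to hold one
record per MATCHCOD.

```sas
* Extract only the needed variables plus the linking key;
DATA BIO;
  SET FYALL1.BIO85A(KEEP=MATCHCOD HEIGHT);
RUN;
PROC SORT DATA=BIO;
  BY MATCHCOD;
RUN;

DATA RANK;
  SET FYALL2.RAN85A(KEEP=MATCHCOD TGRA);
RUN;
PROC SORT DATA=RANK;
  BY MATCHCOD;
RUN;

* Match-merge the extracts and keep officers found in both;
DATA PERM.HGTRANK;
  MERGE BIO(IN=INBIO) RANK(IN=INRANK);
  BY MATCHCOD;
  IF INBIO AND INRANK;
RUN;
```

Both extracts are sorted first because a match-merge with a BY
statement requires its inputs to be in BY-variable order. The
comments use the "* text ;" form because a line beginning with /*
in columns 1-2 of in-stream input would be read by the system as
the end of the SYSIN data.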


A  most  powerful  aspect  of  the  OLRDB  is  that  officer  data  from 
other  research  projects  can  be  merged  with  the  OLRDB,  greatly 
expanding  the  scope  of  possible  analysis  and  multiplying  the 
usefulness  of  research  efforts.  However,  individual  social 
security  numbers  would  be  needed  to  merge  OLRDB  data  sets  with 
non-OLRDB  data.  Requests  for  converting  the  OLRDB  Matchcodes  to 
social  security  numbers  should  be  addressed  to  the  OLRDB  Manager 
via  OLRDB  Data  Request  Form  discussed  later. 


Important  Reminders 

It  is  incumbent  on  the  researcher  to  carefully  screen  the  Data 
Dictionary  to  understand  clearly  what  variables  are  available  and 
what  each  means.  Not  all  data  are  available  on  each  individual.
Some  variables  were  not  collected  for  every  year;  coding  of  some 
variables  changed  over  the  years;  and  historical  records  provide 
less  information  the  further  back  one  looks. 

Currently, inclusion of records in the OLRDB is limited to
commissioned officers on active duty, including those in the
Regular Army, Active Duty Reserve, and Army of the U.S.
Noncommissioned and warrant officers, as well as reserve officers
not on active duty, are excluded.

If  users  have  questions  about  the  OLRDB  or  the  documentation, 
they  are  encouraged  to  contact  the  OLRDB  Manager,  currently 
Fumiyo  Hunter,  202-274-8293,  AV  284-8293,  or  send  e-mail  messages 
to  ARI  VAX  Username  =  OLRDB. 


How  to  Request  OLRDB  Data 

After  a  researcher  selects  the  OLRDB  variables  to  use,  he/she 
must  complete  the  following  Data  Request  Form  and  submit  it  to 
the  Data  Base  Manager,  Fumiyo  Hunter,  Army  Research  Institute, 
Leadership  and  Management  Technical  Area,  PERI-RL,  5001 
Eisenhower  Avenue,  Alexandria,  VA  22333-5600.  The  form  will 
serve two purposes: (1) to keep records of OLRDB use (who uses
it, frequently used types of data, etc.) and (2) to assure
adherence  to  the  requirements  of  the  Privacy  Act.  The  entire 
Data  Request  Form  is  presented  below.  A  researcher  may  print  out 
the  form  from  the  User's  Manual  or  request  a  hard  copy  from  the 
Data  Base  Manager. 
REQUEST  TO  ACCESS/TRANSFER  DATA


FROM  OFFICER  LONGITUDINAL  RESEARCH  DATA  BASE  (OLRDB) 


GENERAL  INFORMATION 

1.  Use  of  data  contained  in  the  Officer  Longitudinal  Research 
Data  Base  (OLRDB)  is  hereby  requested. 

2.  I  have  read  and  understood  the  information  contained  in  the 
on-line  OLRDB  User's  Manual  provided  on  the  Army  Research 
Institute  VAX  computer. 

3.  It  is  understood  that  OLRDB  contains  the  type  of  personal 
information  on  U.S.  Army  officers  covered  by  the  1974  Privacy  Act 
and  is  intended  for  research  use  primarily  by  DoD  researchers. 
Researchers  using  the  data  base  are  responsible  for  preventing 
disclosure  of  any  specific  information  about  any  particular 
individual  to  anyone,  including  the  individual  him/herself,  Army
agencies,  and  any  other  person  or  organization. 

4.  It  is  understood  that  the  user  will  construct  a  working  file 
of  selected  cases  and  data  elements  from  the  OLRDB  and  store  the 
working  file  under  his/her  account.  Exceptions  will  be  made  for
users  with  no  NIH  account,  from  distant  locations.  Such  users 
may  request  a  transfer  of  data  to  other  computer  facilities  by 
magnetic  tape.  The  OLRDB  data  sets  are  to  be  "Read-Only"  and  to 
serve  as  the  source  of  working  files. 

5.  Any  publication  resulting  from  research  using  the  OLRDB  data 
will  include  an  acknowledgement  of  the  OLRDB,  and  a  copy  of  such 
publication  will  be  provided  to  the  OLRDB  Manager. 


DESCRIPTION  OF  RESEARCH  PROJECT 


SPECIFICATION  OF  DATA  REQUESTED 

8.  The  OLRDB  data  will  be  accessed  by: 

a.  Direct  reading  of  the  SAS  data  sets  stored  at  the  NIH 
computer  facility 

_  No.  If no, go to 8.b.

_  Yes.  If yes, complete 8.a.1. below.

8.a.1.  The working file will be downloaded to a computer
system other than the NIH system.

_  No, it will remain at NIH

_  Yes, it will be downloaded to:


b.  Request  data  extracts  to  be  transferred  to  another 
computer  facility  by  magnetic  tape(s).  The  tape  with  OLRDB 
data  extracts  should  be  mailed  to: 


Name:


Address:


Phone Number:  _

Tape specifications:

Track/BPI:  _

_  EBCDIC   _  ASCII

_  Labeled  _  Non-labeled


Other:


9.  The  data  elements  to  be  extracted  into  a  working  file  or  onto 
a  tape  are  listed  on  the  Attachment. 


10.  The  OLRDB  records  need  to  be  identified  with  social  security 
numbers  to  merge  them  with  data  from  other  sources. 


_  Yes.  If yes, the Data Base Manager will convert the
encrypted Matchcode for requested OLRDB cases and data sets into
social security numbers.

_  No


11.  If  the  request  calls  for  a  working  file  with  the  encrypted 
Matchcode  converted  into  social  security  numbers  to  merge  with 
non-OLRDB  data,  it  is  understood  that  once  social  security 
numbers  are  used  to  merge  data,  the  following  steps  will  be  taken 
to  reduce  possibilities  of  accidental  disclosure  of  personal
information:

a.  Each  record  will  be  assigned  a  unique  sequence  number. 


b.  An  individual's  social  security  number  will  be  associated 
with  the  sequence  number. 


c.  A  separate  file  containing  the  matched  social  security 
and  sequence  numbers  will  be  created  and  stored  in  a  separate 
file  with  appropriate  security  measures. 


d.  Personal  identification  information  used  for  merging  data 
sets,  including  social  security  numbers,  will  be  stripped  from 
the  working  file  of  the  merged  data. 

12.  Working  files  created  at  NIH,  ARI,  or  other  computer 
facilities  and  those  residing  on  tapes  will  be: 

_  Scratched,  when  the  current  use  is  completed. 

_  Saved  for  future  use.  Explain  the  nature  of  the  future 


13.  Future  updates  of  the  longitudinal  data  are  requested. 


Submitted by:

Name, title:

Signature:

Organization:

Date:


Submitted to:

Fumiyo Hunter, OLRDB Manager
Army Research Institute
PERI-RL
5001 Eisenhower Avenue
Alexandria, VA 22333-5600
(202) 274-8293, AV 284-8293


ATTACHMENT 


A.  Sample  specification  (e.g.,  all  cases  in  the  OLRDB,  active
duty  since  1980,  ROTC  graduates  only,  etc.): 


B.  OLRDB  Data  Elements  To  Be  Extracted 


SAS DATA SET NAME          VARIABLE NAME

1


Continue  on  additional  sheets. 
HOW  TO  USE  THE  DATA  DICTIONARY


The  Data  Dictionary  is  the  key  to  using  the  OLRDB.  The  Data 
Dictionary  allows  researchers  to  examine  what  data  are  available 
and  whether  they  will  be  useful  for  specific  research.  To  use 
the  Data  Dictionary,  users  should  return  to  the  online  User's 
Manual  main  menu.  When  the  data-dictionary  option  is  selected
from  the  main  menu,  a  list  of  all  OLRDB  data  sets  appears  on  the 
screen.  The  Data  Dictionary  menu  at  the  bottom  of  the  screen 
instructs  the  users  on  how  to  select  a  data  set  and  proceed  to 
the  list  of  variables,  variable  descriptions,  codes,  and  code 
labels.  It  also  contains  options  for  examining  other  data  sets 
or  returning  to  the  User's  Manual  main  menu.  The  online
instructions  and  menu  should  help  users  review  the  OLRDB
data  thoroughly.  However,  users  are  encouraged  to  contact  the 
OLRDB  Manager  with  any  questions  or  suggestions. 


HOW  TO  RUN  A  JOB 
Running  a  job  at  NIH  will  require  the  following  pieces  of
information  which  can  be  acquired  from  the  Data  Dictionary. 


1.  Variable  names. 

2.  The  name  of  the  OLRDB  SAS  data  set  which  contains  the 
variables  to  be  used. 

3.  The tape name and number associated with the SAS data
set.

The OLRDB data are stored at NIH on tapes.  Each tape has a
six-digit tape number and a name, referred to in the JCL as a DSN.

For  storage  reasons  and  simplicity  of  design,  each  OLRDB  tape 
contains  one  OLRDB  data  set  with  the  respective  variables 
associated  with  that  data  set.  Remember  that  the  "tape  name"  and 
"SAS  data  set  name"  may  be  different;  the  tape  name  is  needed  for 
the  JCL,  the  data  set  name,  for  use  with  SAS  commands. 

The  information  listed  above  needs  to  be  "plugged  into"  the  JCL 
before  running  a  job  at  NIH.  From  the  Data  Dictionary,  the  user 
may  write  down  some  information  while  using  the  Dictionary  or 
print  out  the  lists  of  variables,  data  set  names,  tape  names,  and 
tape  numbers.  Instructions  for  printing  are  available  in  the 
Dictionary.

The general procedure for using the OLRDB data will be:

(1) extract selected variables from appropriate data sets,

(2) merge data from two or more data sets, if needed,

(3) create a working file, registered under the user's
account, and

(4) store the working file until the completion of analytic
work AT WHICH TIME IT MUST BE SCRATCHED, unless
continued use is indicated in the Data Request Form.

Some  OLRDB  data  sets  contain  large  numbers  of  cases,  spanning 
many  years.  Many  of  them  may  not  be  applicable  for  a  particular 
analysis  purpose.  Selecting  only  the  necessary  cases  for  the 
working  file  before  proceeding  with  data  analysis  will  help  to 
reduce  computer  time  and  cost. 
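
Case selection can be done in the same DATA step that extracts the
variables, so that unneeded records never reach the working file.
The following is a minimal sketch, not a tested job. It assumes
the DD names FYALL1 and TEMP are allocated in the JCL as in the
examples below, and it uses a hypothetical numeric variable COMMYR
(year of commission) and a hypothetical output name REC80 purely
for illustration. The actual variable name and coding must be
taken from the Data Dictionary.

```sas
DATA TEMP.REC80;
  SET FYALL1.RAN85A(KEEP=MATCHCOD COMMYR TGRA);
  * Subsetting IF: only records passing this test are written out;
  IF COMMYR >= 1980;
RUN;
```

The KEEP= option limits the variables read from the tape, while
the subsetting IF limits the cases, so the two together keep the
working file as small as the analysis allows.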


Example  Job  Set-ups 

The  following  examples  are  basic  JCL  structures  that  show  some  of 
the  various  ways  of  accessing,  storing,  and  using  the  OLRDB  data.

Note:  Several JCL commands are necessary for jobs using the
OLRDB data tapes.  They are:

/*ROUTE XEQ TAPE
/*MESSAGE tapenumber,R
/*ACCESS WRZ1KFD
//SASLIB DD DSN=WRZ1KFD.OLRDB.FORMATS,DISP=SHR

/*ROUTE XEQ TAPE is needed for all jobs requiring tape(s) to be
mounted.  After /*MESSAGE, list the tape number and a code "R"
indicating the tape is to be read but not written.  The OLRDB
tapes are specified to be "read-only" to prevent accidental
alteration of the data.  After /*ACCESS, list the NIH
account/initial WRZ1KFD under which the tapes are registered.
The //SASLIB line identifies the SAS format library (a computer
file) that contains formats, or variable value descriptions, for
many of the OLRDB variables.  The variables are stored in the SAS
data sets along with these formats.  Therefore, attempts to access
OLRDB SAS data sets without referring to the format library by
this JCL statement will be aborted.


Example  1:  Create  a  working  file  of  selected  variables  on  disk 
and  analyze  data.  This  example  shows  the  JCL  and  procedure  that 
could  be  used  to  access  one  tape,  temporarily  store  the  variables 
selected  from  that  tape,  and  run  frequencies  on  the  variables. 

In this example, the variables named BZMAJ, BZLTC, BZCOL, and
TGRA are being pulled from the data set named RAN85A (line 14).
The variables are contained on the tape titled WRZ1KFD.OMF.SDSD
(line 7).  The information on these variables is being
temporarily stored, under the user's account, on TMP002 (lines
9-10) in the data set BZRANK (line 13).  This temporary data set
will be scratched from the disk in 2-3 days.  The TMP disk is
used only for short term (temporary) storage.  After the
variables have been pulled from the source tape (# 034221) and
put onto a disk for use (in this example TMP002), the analyses
that can be performed are almost unlimited.  In Example 1,
frequencies are computed.

 1  //XXXEX1 JOB (YYYY,860,B),KELLY
 2  /*ACCESS WRZ1KFD
 3  //PROCLIB DD DSN=ZABCRUN.PROCLIB,DISP=SHR
 4  //STEP1 EXEC SAS,REGION=4000K
 5  /*ROUTE XEQ TAPE
 6  /*MESSAGE 034221,R
 7  //FYALL1 DD DSN=WRZ1KFD.OMF.SDSD,DISP=SHR,LABEL=(1,SL),
 8  //  UNIT=TAPE,VOL=SER=(034221)
 9  //TEMP DD DSN=YYYYXXX.BZRANK,UNIT=FILE,DISP=(NEW,KEEP),
10  //  VOL=SER=TMP002,SPACE=(TRK,(5,5),RLSE)
11  //SASLIB DD DSN=WRZ1KFD.OLRDB.FORMATS,DISP=SHR
12  //SYSIN DD *
13  DATA TEMP.BZRANK;
14  SET FYALL1.RAN85A(KEEP=BZMAJ BZLTC BZCOL TGRA);
15  PROC FREQ; TABLES
16  BZMAJ*TGRA
17  BZLTC*TGRA
18  BZCOL*TGRA
19  BZMAJ*BZLTC*TGRA
20  BZMAJ*BZCOL*TGRA
21  BZMAJ*BZLTC*BZCOL*TGRA;




Example 2:  Create and store selected variables on disk as a
permanent file.  This example is identical to Example 1 with the
exception of lines 9, 10 and 13.  By storing the data on disk,
the user will have a permanent file which can be accessed
indefinitely.  However, the data storage costs are the highest
for the public disks.  All but the smallest size working files
should be stored on the MSS or magnetic tapes.

 1  //XXXEX2 JOB (YYYY,860,B),KELLY
 2  /*ACCESS WRZ1KFD
 3  //PROCLIB DD DSN=ZABCRUN.PROCLIB,DISP=SHR
 4  //STEP1 EXEC SAS,REGION=4000K
 5  /*ROUTE XEQ TAPE
 6  /*MESSAGE 034221,R
 7  //FYALL1 DD DSN=WRZ1KFD.OMF.SDSD,DISP=SHR,LABEL=(1,SL),
 8  //  UNIT=TAPE,VOL=SER=(034221)
 9  //PERM DD DSN=YYYYXXX.BZRANK,UNIT=FILE,DISP=(NEW,CATLG),
10  //  VOL=SER=FILE44,SPACE=(TRK,(5,5),RLSE)
11  //SASLIB DD DSN=WRZ1KFD.OLRDB.FORMATS,DISP=SHR
12  //SYSIN DD *
13  DATA PERM.BZRANK;
14  SET FYALL1.RAN85A(KEEP=BZMAJ BZLTC BZCOL TGRA);
15  PROC FREQ;
16  TABLES
17  BZMAJ*TGRA
18  BZLTC*TGRA
19  BZCOL*TGRA
20  BZMAJ*BZLTC*TGRA
21  BZMAJ*BZCOL*TGRA
22  BZMAJ*BZLTC*BZCOL*TGRA;


Example 3: Select variables from two data sets and store merged
working file on Mass Storage System (MSS). The example below
allows the user to use variables from two data sets (lines 8-11)
on tapes and store the output permanently on the MSS, under the
user's account (lines 12-13). MSS is useful for storing large
working data sets that will be accessed frequently. The user has
to indicate in the JCL two tape numbers (lines 9, 11), and two
tape names (lines 8, 10). Line 17 is a SAS statement indicating
which variables and respective data sets are to be used. The
KEEP= options read and store the variable HEIGHT from the data
set named BIO85A and the variable TGRA from the data set named
RAN85A. These two variables will be on the newly created data set
named HGTRANK (line 16) on MSS (lines 12-13).

1  //XXXEX3 JOB (YYYY,860,B),KELLY
2  /*ACCESS WRZIKFD
3  /*ROUTE XEQ MSS
4  //PROCLIB DD DSN=ZABCRUN.PROCLIB,DISP=SHR
5  //STEP1 EXEC SAS,REGION=4000K
6  /*ROUTE XEQ TAPE
7  /*MESSAGE 012739,R;034221,R
8  //FYALL1 DD DSN=WRZIKFD.OMF.SDSA,DISP=SHR,LABEL=(1,SL),
9  // UNIT=TAPE,VOL=SER=(012739)
10  //FYALL2 DD DSN=WRZIKFD.OMF.SDSD,DISP=SHR,LABEL=(1,SL),
11  // UNIT=TAPE,VOL=SER=(034221)
12  //PERM DD DSN=YYYYXXX.HGTRANK,UNIT=MSS,DISP=(NEW,CATLG),
13  // SPACE=(CYL,(5,5),RLSE)
14  //SASLIB DD DSN=WRZIKFD.OLRDB.FORMATS,DISP=SHR
15  //SYSIN DD *
16  DATA PERM.HGTRANK;
17  MERGE FYALL1.BIO85A(KEEP=HEIGHT) FYALL2.RAN85A(KEEP=TGRA);
18  BY MATCHCOD;
19  PROC FREQ;
20  TABLES HEIGHT*TGRA;
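
The match-merge in Example 3 can be illustrated on a small scale. The following is a sketch under stated assumptions, not the procedure used to build the OLRDB: the data sets BIO and RAN are invented stand-ins for BIO85A and RAN85A, and the values are made up. It also makes two assumptions explicit. First, the inputs are sorted by MATCHCOD before the merge, because a MERGE with a BY statement expects its inputs in BY-variable order (whether the tape files are already in that order is not stated here). Second, MATCHCOD is listed in each KEEP= option, on the understanding that the BY variable has to be present in every input data set after KEEP= is applied.

```sas
* Hypothetical stand-ins for BIO85A and RAN85A with invented values;
DATA BIO;
  INPUT MATCHCOD HEIGHT;
  DATALINES;
3 70
1 68
2 72
;
RUN;

DATA RAN;
  INPUT MATCHCOD TGRA $;
  DATALINES;
2 MAJ
1 CPT
3 LTC
;
RUN;

* Put both inputs in MATCHCOD order before the match-merge;
PROC SORT DATA=BIO; BY MATCHCOD; RUN;
PROC SORT DATA=RAN; BY MATCHCOD; RUN;

* MATCHCOD is kept with each analysis variable so that BY can find it;
DATA HGTRANK;
  MERGE BIO(KEEP=MATCHCOD HEIGHT) RAN(KEEP=MATCHCOD TGRA);
  BY MATCHCOD;
RUN;

PROC FREQ DATA=HGTRANK;
  TABLES HEIGHT*TGRA;
RUN;
```

Keeping MATCHCOD on the merged data set also allows that data set to be merged again later by the same key.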


Once the user has stored information on a temporary file, disk,
or the MSS, it can be used again with a much faster job turnaround
time. However, long-term storage of large data sets on MSS or
public disks is not recommended.


Example 4: Analyze data stored on MSS. The example below
retrieves data from an existing data set named HGTRANK (line 6),
stored on MSS in Example 3. Because the data set is already on
the MSS, the user can access it quickly (no tapes are required),
manipulate the data, and perform analyses without recreating the
working file. Note that HGTRANK2 in line 9 is a temporary data
set that will be used only for the duration of the job. A
correlation between two of the variables in the data set on MSS
is calculated after the alphabetical codes for Variable TGRA are
converted to numbers.


1  //XXXEX3 JOB (YYYY,860,B),KELLY
2  /*ACCESS WRZIKFD
3  /*ROUTE XEQ MSS
4  //PROCLIB DD DSN=ZABCRUN.PROCLIB,DISP=SHR
5  //STEP1 EXEC SAS,REGION=4000K
6  //FYALL1 DD DSN=YYYYXXX.HGTRANK,DISP=SHR,UNIT=MSS
7  //SASLIB DD DSN=WRZIKFD.OLRDB.FORMATS,DISP=SHR
8  //SYSIN DD *
9  DATA HGTRANK2;
10  SET FYALL1.HGTRANK;
11  IF TGRA='WO1' THEN TGRA1=1;
12  IF TGRA='CW2' THEN TGRA1=2;
13  IF TGRA='CW3' THEN TGRA1=3;
14  IF TGRA='CW4' THEN TGRA1=4;
15  IF TGRA='2LT' THEN TGRA1=5;
16  IF TGRA='1LT' THEN TGRA1=6;
17  IF TGRA='CPT' THEN TGRA1=7;
18  IF TGRA='MAJ' THEN TGRA1=8;
19  IF TGRA='LTC' THEN TGRA1=9;
20  IF TGRA='COL' THEN TGRA1=10;
21  IF TGRA='B G' THEN TGRA1=11;
22  IF TGRA='M G' THEN TGRA1=12;
23  IF TGRA='LTG' THEN TGRA1=13;
24  IF TGRA='GEN' THEN TGRA1=14;
25  IF TGRA='G A' THEN TGRA1=15;
26  PROC CORR; VAR TGRA1;
27  WITH HEIGHT;
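
The recoding step in Example 4 can also be written as a chain of IF/ELSE IF statements, so that testing stops as soon as one grade matches. The following is a minimal sketch on invented records (not OLRDB data). Only four of the grade codes are shown, and setting unlisted grades to missing is an assumption of this sketch rather than something the example above specifies.

```sas
* Hypothetical records, invented for illustration and not taken from the OLRDB;
DATA HGTRANK2;
  INPUT HEIGHT TGRA $;
  * ELSE stops the testing once a grade has matched;
  IF TGRA='2LT' THEN TGRA1=5;
  ELSE IF TGRA='1LT' THEN TGRA1=6;
  ELSE IF TGRA='CPT' THEN TGRA1=7;
  ELSE IF TGRA='MAJ' THEN TGRA1=8;
  ELSE TGRA1=.;
  DATALINES;
68 2LT
70 1LT
71 CPT
73 MAJ
69 CPT
;
RUN;

* Correlate the numeric grade code with height, as in lines 26-27;
PROC CORR DATA=HGTRANK2;
  VAR TGRA1;
  WITH HEIGHT;
RUN;
```

Because TGRA1 is only an ordinal code for grade, the resulting coefficient describes rank order rather than equal steps between grades.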


Example 5: Merge variables from a tape and the MSS by MATCHCOD,
create a new data set to store on tape. This example takes the
variable BABR from the data set BRN85A and the variable TGRA
from the data set HGTRANK (line 16). The first data set is
contained on tape number 034050, named WRZIKFD.OMF.SDSB
(lines 8-9). The second data set was stored as an MSS data set
HGTRANK (line 10) in Example 3. Recall that the data set
HGTRANK contains data from two other data sets. By merging
HGTRANK with BRN85A, a new separate data set has been created
(BABRTGRA, line 15) with variables from three separate data sets
and stored on tape (lines 11-12) under the user's account. NIH
computer users are limited to using two tapes in a B class job.
By merging previously stored data sets (derived from the data
sets on tapes) with data sets on one or two tapes, they can avoid
having to run C class jobs. C class jobs only run overnight.
(See the NIH Computer User's Guide for descriptions of job
classes.)


1  //XXXEX5 JOB (YYYY,860,B),KELLY
2  /*ACCESS WRZIKFD
3  /*ROUTE XEQ MSS
4  //PROCLIB DD DSN=ZABCRUN.PROCLIB,DISP=SHR
5  //STEP1 EXEC SAS,REGION=4000K
6  /*ROUTE XEQ TAPE
7  /*MESSAGE 034050,R;999999,W
8  //FYALL1 DD DSN=WRZIKFD.OMF.SDSB,DISP=SHR,LABEL=(1,SL),
9  // UNIT=TAPE,VOL=SER=(034050)
10  //FYALL2 DD DSN=WRZIKFD.HGTRANK,DISP=SHR,UNIT=MSS
11  //PERM DD DSN=YYYYXXX.BABRTGRA,UNIT=TAPE,DISP=(NEW,KEEP),
12  // VOL=(PRIVATE,SER=999999)
13  //SASLIB DD DSN=WRZIKFD.OLRDB.FORMATS,DISP=SHR
14  //SYSIN DD *
15  DATA PERM.BABRTGRA;
16  MERGE FYALL1.BRN85A(KEEP=BABR) FYALL2.HGTRANK(KEEP=TGRA);
17  BY MATCHCOD;
18  PROC FREQ;
19  TABLES
20  BABR*TGRA;


